add Lsn type #64
Conversation
Well, I cannot agree that it is a "zero-cost wrapper".
Please look at this simple program:
```rust
use std::time::Instant;
use std::ops::Add;

#[derive(Debug, Clone, Copy, Eq, Ord, PartialEq, PartialOrd)]
pub struct Lsn(pub u64);

impl From<u64> for Lsn {
    fn from(n: u64) -> Self {
        Lsn(n)
    }
}

impl Add<u64> for Lsn {
    type Output = Lsn;
    fn add(self, other: u64) -> Self::Output {
        // panic if the addition overflows.
        Lsn(self.0.checked_add(other).unwrap())
    }
}

fn inc_lsn_struct(lsn: Lsn) -> Lsn {
    lsn + 1
}

fn inc_lsn(lsn: u64) -> u64 {
    lsn + 1
}

pub fn main() {
    const ITERATIONS: u64 = 1000000000;
    {
        let now = Instant::now();
        let mut lsn = Lsn(0);
        for _ in 1..ITERATIONS {
            lsn = inc_lsn_struct(lsn);
        }
        println!("Elapsed for LSN {:?} struct {:?}", lsn, now.elapsed());
    }
    {
        let now = Instant::now();
        let mut lsn: u64 = 0;
        for _ in 1..ITERATIONS {
            lsn = inc_lsn(lsn);
        }
        println!("Elapsed for LSN {:?} struct {:?}", lsn, now.elapsed());
    }
}
```
Results on my notebook are the following:

```
knizhnik@xps:~$ rustc -O call.rs
knizhnik@xps:~$ ./call
Elapsed for LSN Lsn(999999999) struct 27.057152ms
Elapsed for LSN 999999999 struct 39ns
knizhnik@xps:~$ rustc call.rs
knizhnik@xps:~$ ./call
Elapsed for LSN Lsn(999999999) struct 45.84106434s
Elapsed for LSN 999999999 struct 30.92782111s
```
That is about a 1.5× difference for non-optimized code and about 1000× for optimized code.
I am not sure the optimizer doesn't eliminate the calls entirely in the latter case. And the speed is still very high, so I do not think we will notice any difference if we use this Lsn representation.
But it is still not possible to say that the cost is zero :)
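One way to rule out the possibility that the optimizer deleted the calls entirely (which the 39ns result strongly suggests) is `std::hint::black_box`, stabilized in Rust 1.66. A sketch of the struct half of the benchmark, with a reduced iteration count to keep it quick:

```rust
use std::hint::black_box;
use std::time::Instant;

#[derive(Debug, Clone, Copy)]
pub struct Lsn(pub u64);

fn inc_lsn_struct(lsn: Lsn) -> Lsn {
    // Same checked add as in the benchmark above; panics on overflow.
    Lsn(lsn.0.checked_add(1).unwrap())
}

fn main() {
    // Fewer iterations than the original 1_000_000_000, just to keep it fast.
    const ITERATIONS: u64 = 10_000_000;
    let now = Instant::now();
    let mut lsn = Lsn(0);
    for _ in 0..ITERATIONS {
        // black_box makes the value opaque to the optimizer, so the loop
        // cannot be constant-folded away.
        lsn = black_box(inc_lsn_struct(lsn));
    }
    println!("Elapsed for {:?}: {:?}", lsn, now.elapsed());
}
```

With the calls forced to happen, the wrapper and the bare `u64` should time much closer to each other.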
Overall, looks good to me.
```rust
impl fmt::Display for Lsn {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // FIXME: should this be "{:X}/{:08X}" ?
        write!(f, "{:X}/{:X}", self.0 >> 32, self.0 & 0xffffffff)
```
The snprintf pattern used in postgres is `"%X/%X"`, so `"{:X}/{:X}"` is in line with that. That said, Postgres input functions do accept the extra 0's. And the extra zeros would be helpful when you're eyeballing different LSN values trying to see which is larger, like `0/10000000` vs `0/1200000`.
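For comparison, here is what the two format strings produce for the `0/1200000` example above (a quick standalone check, not code from the PR):

```rust
fn main() {
    let lsn: u64 = 0x1200000;
    // Postgres "%X/%X" style: no zero padding on the low half.
    assert_eq!(format!("{:X}/{:X}", lsn >> 32, lsn & 0xffffffff), "0/1200000");
    // Padded "{:X}/{:08X}" style: the low half is always 8 hex digits,
    // so magnitudes line up when eyeballing two LSNs.
    assert_eq!(format!("{:X}/{:08X}", lsn >> 32, lsn & 0xffffffff), "0/01200000");
}
```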
I'll defer to the people with more postgres knowledge. Which one would you like?
```diff
 }

 impl CacheKey {
     pub fn pack(&self, buf: &mut BytesMut) {
         self.tag.pack(buf);
-        buf.put_u64(self.lsn);
+        buf.put_u64(self.lsn.0);
```
Why is `lsn.0` needed, why not just `lsn`? I thought the `impl From<Lsn> for u64` trait would make that work.
The function takes only a u64:

```rust
fn put_u64(&mut self, n: u64);
```

There are a few ways to get automatic conversions, but none of them happen to be used here:

- Rust always tries to do "autoref" of the `self` argument. This is what allows you to type, e.g., `my_vec.len()` instead of `(&my_vec).len()`.
- Some functions have generic parameters that can handle the coercion, e.g. one could write:

  ```rust
  fn put_u64<T>(&mut self, n: T)
  where
      T: Into<u64>,
  {
      let n: u64 = n.into();
      // do stuff with n
  }
  ```

- You can do a similar generic type for references using `AsRef` or `AsMut`. You'll see this in the standard library a lot, e.g. `File` has `open<P: AsRef<Path>>(path: P)`, and there's a long list of `std` types that implement `AsRef<Path>`.
The question mark operator (for propagating errors) uses `Into<___>` for whatever your returned error type is. So even if you're not using `anyhow` or `thiserror`, implementing `Into` (or more likely, `From`) for your error types will allow you to use a variety of error types with `?`.
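As a quick illustration of that last point, here is a sketch with made-up `ParseLsnError`/`AppError` types (both hypothetical, not from this PR) showing `From` powering `?`:

```rust
// Hypothetical error types, just to illustrate `?` + From conversions.
#[derive(Debug, PartialEq)]
struct ParseLsnError;

#[derive(Debug, PartialEq)]
enum AppError {
    Parse(ParseLsnError),
}

// Implementing From is what lets `?` convert ParseLsnError into AppError.
impl From<ParseLsnError> for AppError {
    fn from(e: ParseLsnError) -> Self {
        AppError::Parse(e)
    }
}

// Parse a Postgres-style "hi/lo" LSN string into a u64.
fn parse_lsn(s: &str) -> Result<u64, ParseLsnError> {
    let (hi, lo) = s.split_once('/').ok_or(ParseLsnError)?;
    let hi = u64::from_str_radix(hi, 16).map_err(|_| ParseLsnError)?;
    let lo = u64::from_str_radix(lo, 16).map_err(|_| ParseLsnError)?;
    Ok((hi << 32) | lo)
}

fn load_lsn() -> Result<u64, AppError> {
    // `?` applies From<ParseLsnError> for AppError automatically.
    let lsn = parse_lsn("16/B374D848")?;
    Ok(lsn)
}

fn main() {
    assert_eq!(load_lsn(), Ok(0x16B374D848));
}
```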
I do agree that `buf.put_u64(self.lsn.0)` isn't very friendly.

However, I think that all of this manual byte-pushing code should be replaced with `serde` anyway. As long as we `#[derive(Serialize, Deserialize)]` on the `Lsn` type then it will work exactly the same as `u64` with no extra code at all.
It's not a big deal, we can live with it. I was just surprised it didn't "just work".

> However, I think that all of this manual byte-pushing code should be replaced with serde anyway. As long as we #[derive(Serialize, Deserialize)] on the Lsn type then it will work exactly the same as u64 with no extra code at all.

That handles this particular case, but I would imagine that this arises in many other places, too. One example off the top of my head: what if you want to store an Lsn in an AtomicU64?

I think this works, too:

```rust
buf.put_u64(self.lsn.into());
```
I haven't found a good crate that allows `Atomic<T>` where it's implemented as e.g. `AtomicU64` if the size is 64 bits. I made one myself, but never quite finished it. Handling the case where `T` is not `Copy` gets quite nasty, but we don't need that here, so I could probably dig out the simple version (that requires `T: Copy`) if we want it.

Or we could just do a quick `AtomicLsn` wrapper, and wait to see how often we need to cut/paste that code...
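A minimal sketch of what such an `AtomicLsn` wrapper could look like, using only `std`. The method set and memory orderings here are assumptions, not the PR's actual code:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Lsn(pub u64);

/// Sketch: an Lsn stored inside an AtomicU64, so it can be shared
/// across threads without a mutex.
pub struct AtomicLsn(AtomicU64);

impl AtomicLsn {
    pub fn new(lsn: Lsn) -> Self {
        AtomicLsn(AtomicU64::new(lsn.0))
    }

    pub fn load(&self) -> Lsn {
        Lsn(self.0.load(Ordering::SeqCst))
    }

    pub fn store(&self, lsn: Lsn) {
        self.0.store(lsn.0, Ordering::SeqCst);
    }
}

fn main() {
    let cached = AtomicLsn::new(Lsn(0x16));
    cached.store(Lsn(0x17));
    assert_eq!(cached.load(), Lsn(0x17));
}
```

`load` and `store` on `AtomicU64` compile to plain (fenced) memory accesses on x86, so this stays cheap.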
Another thought: the kind of argument mismatch that `put_u64` hits gets less likely as we get to more complex types (that aren't `Copy`), because they're likely to be passed by reference.

If we have a bunch of functions that take by-reference arguments, e.g.

```rust
fn handle_mytype(mt: &MyType);
```

Then you'll note that the `mt` argument can be satisfied by any container that `Deref`s to a `MyType`. For example, `Arc<MyType>` or `MutexGuard<'_, MyType>` both impl `Deref` with `Target = MyType`, so you can write:

```rust
handle_mytype(&mt);
```

... and that code will continue to work whether `mt` is a `MyType` or `Arc` or `MutexGuard` (or some other container that we build ourselves).

So your instinct that wrapper types should "just work" is still partially true, at least for the bigger data structures.
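For example, deref coercion makes all three of these calls compile against the same `&MyType` parameter. The names are illustrative, and the function returns a field only so the calls are checkable:

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug)]
struct MyType(u32);

// Illustrative by-reference function.
fn handle_mytype(mt: &MyType) -> u32 {
    mt.0
}

fn main() {
    let plain = MyType(1);
    let arc = Arc::new(MyType(2));
    let mutex = Mutex::new(MyType(3));
    let guard = mutex.lock().unwrap();

    // &plain is already &MyType; &arc and &guard deref-coerce to &MyType.
    assert_eq!(handle_mytype(&plain), 1);
    assert_eq!(handle_mytype(&arc), 2);
    assert_eq!(handle_mytype(&guard), 3);
}
```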
Would it be better to add pack/unpack methods to Lsn? So that it is possible to write:

```rust
self.tag.pack(buf);
self.lsn.pack(buf);
```
> Would it be better to add pack/unpack methods to Lsn?

Yes, this can be done automatically with `#[derive(serde::Serialize, serde::Deserialize)]`, and then we can use any serde-compatible library we like. The `bincode` crate seems to implement the raw serialization we need in most places (including endian conversions).

I would need a bit more time to look at the code, but the end result is that we should be able to write things like `my_data_structure.serialize(&mut buffer)` without ever needing to type `put_u64` again.

There is one case I'm not certain about, and that's when the network packet includes little-endian data that can be copied directly into a Rust struct with no validation. I don't think it's possible to generate that code with `serde` right now (see also #6), but if the performance is important to us, I think we can build tools to do that.
Ah, but you cheated :) It's true that you will see the extra layer of abstraction in a debug build, but in release mode Rust is very good at seeing right through abstractions like this. Zero-cost abstractions are one of Rust's primary design goals. Rust optimization can be a bit more powerful than in C or C++, because the type system carries a lot more information. And the Rust optimizer will never break your code the way that often happens to C/C++ programs that didn't know they were relying on some hidden undefined behavior.
And one of these days, Rust optimization will get even better! There has been a long fight against LLVM bugs, but in March the nightly compiler enabled the "mutable noalias" optimization: because mutable references are unique, two `&mut` references are known not to point to the same location, which lets the compiler optimize more aggressively.

This optimization has been enabled and reverted a few times (because of bugs in LLVM), so we might not see it in a stable release yet. But it will happen eventually.
Sorry, I didn't expect that just one extra instruction (checking the overflow bit) could affect performance so dramatically.
pageserver/src/waldecoder.rs
```diff
 if is_xlog_switch_record(&recordbuf) {
-    trace!(
-        "saw xlog switch record at {:X}/{:X}",
-        (self.lsn >> 32),
-        self.lsn & 0xffffffff
-    );
-    self.padlen = (WAL_SEGMENT_SIZE - (self.lsn % WAL_SEGMENT_SIZE)) as u32;
+    trace!("saw xlog switch record at {}", self.lsn);
+    self.padlen = (WAL_SEGMENT_SIZE - (self.lsn.0 % WAL_SEGMENT_SIZE)) as u32;
 }
```
```diff
-if self.lsn % 8 != 0 {
-    self.padlen = 8 - (self.lsn % 8) as u32;
+// FIXME: what does this code do?
+if self.lsn.0 % 8 != 0 {
+    self.padlen = 8 - (self.lsn.0 % 8) as u32;
 }
```
Could someone educate me about what's happening here? It looks like:

- If "switch record" (what's that?), then compute padding to the end of the segment.
- If the current LSN isn't a multiple of 8, then compute padding to the next 8-byte boundary, overwriting the padding computed above (?!)

How much of this should be captured into `Lsn` member functions? We could add something like `Lsn::pad_to(chunk_size: u64)` that could be used for both.
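A sketch of that hypothetical `Lsn::pad_to` helper, assuming power-of-two chunk sizes. Note one behavioral difference: this returns 0 for an already-aligned LSN, whereas the segment-switch branch above yields a full `WAL_SEGMENT_SIZE` of padding in that case.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Lsn(pub u64);

impl Lsn {
    /// Hypothetical helper (not in the PR): bytes of padding needed to
    /// reach the next multiple of `chunk_size`, which must be a power
    /// of two. Returns 0 when the LSN is already aligned.
    pub fn pad_to(self, chunk_size: u64) -> u64 {
        debug_assert!(chunk_size.is_power_of_two());
        // (-lsn) mod chunk_size, computed with wrapping arithmetic.
        self.0.wrapping_neg() & (chunk_size - 1)
    }
}

fn main() {
    assert_eq!(Lsn(5).pad_to(8), 3);  // 5 -> 8 needs 3 bytes
    assert_eq!(Lsn(16).pad_to(8), 0); // already aligned
}
```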
WAL in postgres is split into segments. The default size is 16MB, but unlike XLOG_BLCKSZ it can be easily changed.
Switching XLOG means starting to write to the next segment file.
Usually an XLOG switch record is not needed: a WAL segment is filled to the end and then we switch to the next segment. But in some cases we may need to force switching to a new segment although there is still space in the current one. It may be needed for WAL archiving.
WAL records are aligned on an 8-byte boundary. Padding is used to enforce such alignment.
Actually the code fragment can be simplified as:

```rust
self.padlen = (-self.lsn.0 & 7) as u32;
```
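One caveat: unary minus is not defined for `u64` in Rust, so `-self.lsn.0` as written won't compile; `wrapping_neg` expresses the same two's-complement trick. A quick standalone check that it matches the original branchy version:

```rust
fn main() {
    // Compare the original branchy padding computation with the bit
    // trick for the first few LSN values.
    for lsn in 0u64..=24 {
        let original = if lsn % 8 != 0 { 8 - (lsn % 8) } else { 0 };
        let simplified = lsn.wrapping_neg() & 7;
        assert_eq!(original, simplified, "mismatch at lsn={}", lsn);
    }
}
```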
If you want to see an xlog switch in action, you can force it with `select pg_switch_wal();`. Can come in handy in testing. See https://www.postgresql.org/docs/current/functions-admin.html.
Is it a bug that the code overwrites the "switch record" padding with the 8-byte wal record padding?
This type is a zero-cost wrapper for a u64, meant to help code communicate with precision what that value means. It implements Display and Debug. Display "{}" will format as "1234ABCD:5678CDEF" while Debug will format as Lsn{1234567890}.
SeqWait can use any type that is Ord + Debug + Copy. Debug is not strictly necessary, but allows us to keep the panic message if a caller wants the sequence number to go backwards.
Use the `Lsn` type everywhere that I can find u64 being used to represent an LSN.
There is only one place doing subtraction, and it had a manually implemented check.
AtomicLsn is a wrapper around AtomicU64 that has load() and store() members that are cheap (on x86, anyway) and can be safely used in any context. This commit uses AtomicLsn in the page cache, and fixes up some downstream code that manually implemented LSN formatting. There's also a bugfix to the logging in wait_lsn, which printed the wrong lsn value.
There's no clear way to sum LSNs across timelines, so just remove them for now.
Replace open-coded math with member fns.
This type is a zero-cost wrapper for a u64, meant to help code communicate with precision what that value means, and prevent any type mix-ups.

It implements `Display` and `Debug`. `Display` will format as `1234ABCD:5678CDEF` while `Debug` will format as `Lsn(1234567890)`.

This isn't ready to merge; it's for code review purposes only. There are lots of FIXME comments that want feedback.

Fixes #49.